Running ~5k Haliotis discus hannai ESTs through CD-HIT-EST
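For reference, a CD-HIT-EST run of this kind could be launched roughly as sketched below. The identity threshold, word size, thread count, and file names are all assumptions for illustration, not the settings used for this run.

```python
import shutil
import subprocess


def build_cdhit_est_command(fasta_in, fasta_out, identity=0.95, word_size=10, threads=1):
    """Return the argument list for a CD-HIT-EST run (default values are assumptions)."""
    return [
        "cd-hit-est",
        "-i", fasta_in,        # input FASTA of ESTs
        "-o", fasta_out,       # representative sequences; clusters go to <fasta_out>.clstr
        "-c", str(identity),   # sequence identity threshold
        "-n", str(word_size),  # word length; the CD-HIT guide pairs 10 or 11 with -c 0.95 to 1.0
        "-T", str(threads),    # number of threads
    ]


def run_cdhit_est(cmd):
    """Run the command, but only if the cd-hit-est binary is on PATH."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("cd-hit-est not found on PATH")
    subprocess.run(cmd, check=True)
```

Calling `build_cdhit_est_command("ests.fasta", "ests_cdhit.fasta")` (hypothetical file names) returns the argument list, which can be printed for the notebook or passed to `run_cdhit_est`.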
Sequence type         DNA
No. sequences         2809
Longest sequence      1066
Shortest sequence     51
Average length        421
Total letters         1184035
Total N letters       2423
Total non N           1181612
Sequences with N      918
GC content distribution
GC content (%)  No. sequences  % sequences
0%-5%           0              0%
5%-10%          0              0%
10%-15%         0              0%
15%-20%         4              0.14%
20%-25%         9              0.32%
25%-30%         39             1.38%
30%-35%         96             3.41%
35%-40%         220            7.83%
40%-45%         503            17.9%
45%-50%         716            25.48%
50%-55%         828            29.47%
55%-60%         339            12.06%
60%-65%         53             1.88%
65%-70%         2              0.07%
70%-75%         0              0%
75%-80%         0              0%
80%-85%         0              0%
85%-90%         0              0%
90%-95%         0              0%
95%-100%        0              0%
Length distribution
Length (bp)     No. sequences  % sequences
50-99           11             0.39%
100-149         147            5.23%
150-199         178            6.33%
200-249         208            7.4%
250-299         263            9.36%
300-349         267            9.5%
350-399         288            10.25%
400-449         240            8.54%
450-499         253            9%
500-549         208            7.4%
550-599         195            6.94%
600-649         193            6.87%
650-699         177            6.3%
700-749         96             3.41%
750-799         42             1.49%
800-849         38             1.35%
850-899         3              0.1%
900-949         0              0%
950-999         0              0%
1000-1049       1              0.03%
1050-1099       1              0.03%
Length distribution
Length (bp)     No. sequences  % sequences
51-101          17             0.6%
101-151         153            5.44%
152-202         180            6.4%
203-253         216            7.68%
254-304         265            9.43%
305-355         280            9.96%
355-405         296            10.53%
406-456         246            8.75%
457-507         240            8.54%
508-558         196            6.97%
559-609         223            7.93%
609-659         180            6.4%
660-710         161            5.73%
711-761         84             2.99%
762-812         38             1.35%
813-863         30             1.06%
863-913         2              0.07%
914-964         0              0%
965-1015        0              0%
1016-1066       2              0.07%
Sequence type         DNA
No. sequences         3275
Longest sequence      1066
Shortest sequence     51
Average length        413
Total letters         1355756
Total N letters       2978
Total non N           1352778
Sequences with N      1144
GC content distribution
GC content (%)  No. sequences  % sequences
0%-5%           0              0%
5%-10%          0              0%
10%-15%         0              0%
15%-20%         4              0.12%
20%-25%         8              0.24%
25%-30%         41             1.25%
30%-35%         105            3.2%
35%-40%         246            7.51%
40%-45%         571            17.43%
45%-50%         866            26.44%
50%-55%         975            29.77%
55%-60%         394            12.03%
60%-65%         62             1.89%
65%-70%         3              0.09%
70%-75%         0              0%
75%-80%         0              0%
80%-85%         0              0%
85%-90%         0              0%
90%-95%         0              0%
95%-100%        0              0%
Length distribution
Length (bp)     No. sequences  % sequences
50-99           12             0.36%
100-149         186            5.67%
150-199         214            6.53%
200-249         247            7.54%
250-299         307            9.37%
300-349         323            9.86%
350-399         361            11.02%
400-449         281            8.58%
450-499         284            8.67%
500-549         247            7.54%
550-599         216            6.59%
600-649         218            6.65%
650-699         194            5.92%
700-749         100            3.05%
750-799         42             1.28%
800-849         38             1.16%
850-899         3              0.09%
900-949         0              0%
950-999         0              0%
1000-1049       1              0.03%
1050-1099       1              0.03%
Length distribution
Length (bp)     No. sequences  % sequences
51-101          18             0.54%
101-151         193            5.89%
152-202         217            6.62%
203-253         253            7.72%
254-304         313            9.55%
305-355         341            10.41%
355-405         370            11.29%
406-456         283            8.64%
457-507         271            8.27%
508-558         235            7.17%
559-609         241            7.35%
609-659         209            6.38%
660-710         173            5.28%
711-761         86             2.62%
762-812         38             1.16%
813-863         30             0.91%
863-913         2              0.06%
914-964         0              0%
965-1015        0              0%
1016-1066       2              0.06%
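Summary reports like the two above can be approximated with a short script. This is a minimal sketch, not the tool that produced them: the function names, the 50 bp and 5% bin widths, and the choice to compute GC content over non-N bases are assumptions. The averages in both reports do match floor division of total letters by sequence count (1184035 // 2809 = 421 and 1355756 // 3275 = 413), so the sketch uses that.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(seqs, len_bin=50, gc_bin=5):
    """Return summary statistics for a collection of DNA sequences."""
    seqs = [s.upper() for s in seqs]
    if not seqs:
        raise ValueError("no sequences given")
    lengths = [len(s) for s in seqs]
    n_counts = [s.count("N") for s in seqs]
    total, total_n = sum(lengths), sum(n_counts)

    # Histogram keys are the lower edge of each bin (e.g. 50 means 50-99 bp).
    length_hist = Counter((n // len_bin) * len_bin for n in lengths)

    gc_hist = Counter()
    for s, n in zip(seqs, n_counts):
        called = len(s) - n  # assumption: GC% is taken over non-N bases
        if called == 0:
            continue
        gc = 100.0 * (s.count("G") + s.count("C")) / called
        # Put exactly 100% into the top bin (95%-100%) instead of a new one.
        gc_hist[min(int(gc // gc_bin) * gc_bin, 100 - gc_bin)] += 1

    return {
        "no_sequences": len(seqs),
        "longest": max(lengths),
        "shortest": min(lengths),
        "average_length": total // len(seqs),  # floor, as in the reports
        "total_letters": total,
        "total_n": total_n,
        "total_non_n": total - total_n,
        "sequences_with_n": sum(1 for n in n_counts if n > 0),
        "length_hist": dict(length_hist),
        "gc_hist": dict(gc_hist),
    }
```

With a FASTA file open as `handle`, `summarize(seq for _, seq in read_fasta(handle))` returns the dictionary; the percentage columns follow from dividing each histogram count by `no_sequences`.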
Blasting on Inquiry... Job 690
Running de novo with CLC 4.0
Assembly with beta ..
Using Server to map reads back to the assembly to get an idea of % mapped.
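If the read mapping is exported as SAM, the % mapped can be checked independently with a few lines of Python. This is a sketch under assumptions: it presumes a SAM export is available, which these notes do not confirm, and the function name is hypothetical. It counts one record per read by skipping secondary and supplementary alignments.

```python
def percent_mapped(sam_lines):
    """Return the percentage of primary SAM records that are mapped."""
    total = mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & 0x100 or flag & 0x800:
            continue  # secondary or supplementary alignment, not a new read
        total += 1
        if not flag & 0x4:  # bit 0x4 set means the read is unmapped
            mapped += 1
    return 100.0 * mapped / total if total else 0.0
```

For single-end ESTs each primary record is one read, so the result is directly the fraction of reads mapped back; for paired data each mate would be counted separately.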